Abstract: The downlink transmission in multi-user multiple-input
multiple-output (MIMO) systems has been extensively studied
from both communication-theoretic and information-theoretic
perspectives. Most of these papers assume perfect/imperfect 
channel knowledge. In general, the problem of channel training 
and estimation is studied separately. However, in interference- 
limited communication systems with high mobility, this problem 
is tightly coupled with the problem of maximizing throughput of 
the system. In this paper, scheduling and pre-conditioning based 
schemes in the presence of reciprocal channel are considered 
to address this. In the case of homogeneous users, a scheduling 
scheme is proposed and an improved lower bound on the sum 
capacity is derived. The problem of choosing training sequence 
length to maximize net throughput of the system is studied. In the 
case of heterogeneous users, a modified pre-conditioning method 
is proposed and an optimized pre-conditioning matrix is derived. 
This method is combined with a scheduling scheme to further 
improve net achievable weighted-sum rate. 

I. Introduction 

Downlink transmission in a multiple antenna setting is 
both a well studied and a complex problem with myriad 
parameters. A natural problem to be studied in this setting is 
to maximize throughput on the downlink while constraining 
the complexity at the terminals to be minimal. The problem 
of multi-antenna downlink transmission has been previously 
studied from many different perspectives [1]-[4]. In many
of these papers, the channel is assumed to be known a- 
priori either perfectly or imperfectly at the base-station and/or 
terminals. The distinguishing feature of this paper is that we 
study the problem with no assumptions on channel knowledge 
both at the base-station and terminals (users). In addition, 
we consider a very realistic and difficult communication regime
when the forward SINRs are low (w dB). We consider this 
regime since interference from neighboring base-stations does 
not allow one to make SINRs larger. Specifically, the scenario 
we study is the following: an M-element antenna array at the
base-station, and single antennas at the K (≤ M) autonomous
terminals as shown in Fig. 1. The channel is assumed to
undergo block fading with a coherence interval of T symbols. 
A time-division duplex (TDD) operation is considered. In a 
TDD system, the reverse channel and forward channel share a 
reciprocity relationship. Our system model is a generalization 
of the system model considered in [5]. We look at the net 
impact of training, estimation, scheduling and pre-conditioning 
on the throughput of the system. 

Fig. 1. Multi-User MIMO TDD System Model



The use of multiple antennas instead of single antennas at 
the transmitter and receiver in a point-to-point communication 
system has been shown to greatly improve the capacity of 
the wireless channels [6], [7]. Later, the sum capacity of the 
multiple-input multiple-output (MIMO) Gaussian broadcast 
channel has been shown to be achieved by dirty paper coding 
(DPC) [2], [8], [9]. Recently, it was shown that DPC actually 
characterizes the full capacity region of the MIMO Gaussian 
broadcast channel [10]. In addition to the assumption that 
channel is perfectly known at the transmitter and the receivers, 
DPC scheme requires enormous computational power making 
it challenging to implement in practice. Motivated by this, 
many precoding and scheduling schemes have been proposed 
to obtain near-optimal performance with low complexity in 
certain scenarios [11]-[14]. However, these schemes are not 
applicable to the scenario we consider. 

We first look at the homogeneous users scenario, where 
all users have same forward and same reverse signal to 
interference-plus-noise ratios (SINRs), and obtain a rigorous 
lower bound on the sum capacity. The lower bound obtained 
in this paper is tighter than the lower bound given in [5]. The 
improvement comes from the scheduling strategy used which 
is simple, and in fact even considerably reduces the computa- 
tional complexity of pre-conditioning. In this context, we also 
study the problem of optimizing the training sequence length 
and the number of users to maximize net throughput of the 
system. Next, we look at the more general heterogeneous users 
scenario and study the problem of maximizing achievable 
weighted-sum rate. We propose a modified pre-conditioning 
method and obtain an optimized pre-conditioning matrix under 
the M-large assumption. We combine this method with a simple
scheduling strategy to take advantage of instantaneous channel
variations.

We organize the remaining sections of this paper as follows. 
Section II describes the system model and Section III explains
the reciprocal training used. Section IV and Section V describe
the schemes proposed to increase achievable sum/weighted-sum
rate in homogeneous and heterogeneous scenarios,
respectively. We provide numerical results in Section VI and
discuss our conclusions in Section VII.

A. Notations 

In this paper, bold font variables denote vectors or matrices. 
All vectors are column vectors. $(\cdot)^T$, $(\cdot)^*$, $(\cdot)^{\dagger}$ and $\mathrm{tr}(\cdot)$ denote
transpose, conjugate, Hermitian and trace, respectively. $E[\cdot]$
and $\mathrm{var}\{\cdot\}$ stand for expectation and variance operations,
respectively. $\mathrm{diag}\{\mathbf{a}\}$ stands for the L × L diagonal matrix
with diagonal entries equal to the L components of $\mathbf{a}$.

II. Model Description 

The base-station with M antennas communicates with the 
K independent users on both forward and reverse links as 
shown in Fig. 1. The forward channel is characterized by the
K × M propagation matrix H. We assume independent
Rayleigh fading channels, which remain constant over a
duration of T symbols called the coherence interval. The
entries of the channel matrix H are independent and identically
distributed (i.i.d.) zero-mean, circularly-symmetric complex
Gaussian CN(0, 1) random variables. Our model incorporates
frequency selectivity of fading by using orthogonal frequency- 
division multiplexing (OFDM). Note that the duration of the 
coherence interval in symbols is chosen for the OFDM sub- 
band. Due to reciprocity, we assume that the reverse channel 
at any instant is the transpose of the forward channel. 

Let the forward and reverse SINRs associated with the $k$-th user
be $\rho_{fk}$ and $\rho_{rk}$, respectively. These forward and reverse SINRs
remain fixed throughout the channel uses. On the forward link,
the signal received by the $k$-th user is

$$x_{fk} = \sqrt{\rho_{fk}}\,\mathbf{h}_k^T \mathbf{s}_f + w_{fk} \tag{1}$$

where $\mathbf{h}_k^T$ is the $k$-th row of the channel matrix $\mathbf{H}$ and $\mathbf{s}_f$
is the M × 1 vector in which information symbols to be
communicated are embedded. The components of the additive
noise vector $[w_{f1}\ w_{f2}\ \cdots\ w_{fK}]$ are i.i.d. CN(0, 1). The average
power constraint at the base-station during transmission
is $E[\|\mathbf{s}_f\|^2] = 1$ so that the total transmit power is fixed
irrespective of its number of antennas. On the reverse link,
the vector received at the base-station is

$$\mathbf{x}_r = \mathbf{H}^T \mathbf{E}_r \mathbf{s}_r + \mathbf{w}_r \tag{2}$$

where $\mathbf{s}_r$ is the signal-vector transmitted by the users and
$\mathbf{E}_r = \mathrm{diag}\{[\sqrt{\rho_{r1}}\ \sqrt{\rho_{r2}}\ \cdots\ \sqrt{\rho_{rK}}]^T\}$. The components of the additive
noise $\mathbf{w}_r$ are i.i.d. CN(0, 1). There is a power constraint at
every user during transmission given by $E[|s_{rk}|^2] = 1$ where
$s_{rk}$ is the $k$-th component of $\mathbf{s}_r$.



III. Channel Estimation 

Channel reciprocity is one of the key advantages of TDD 
systems over frequency-division duplex (FDD) systems. We 
exploit this property to perform channel estimation by trans- 
mitting training sequences on the reverse link. Every user 
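transmits a sequence of training signals of $\tau_{rp}$ symbols duration
in every coherence interval. We assume that these training
sequences are known a-priori to the base-station. The $k$-th
user transmits the training sequence vector $\sqrt{\tau_{rp}}\,\boldsymbol{\psi}_k^{\dagger}$. We use
orthonormal sequences, which implies $\boldsymbol{\psi}_i^{\dagger}\boldsymbol{\psi}_j = \delta_{ij}$ where $\delta_{ij}$ is
the Kronecker delta. The use of orthogonal sequences restricts
the maximum number of users to $\tau_{rp}$, i.e., $K \le \tau_{rp}$.

The corrupted training signals received at the base-station are

$$\mathbf{Y}_r = \sqrt{\tau_{rp}}\,\mathbf{H}^T \mathbf{E}_r \boldsymbol{\Psi}^{\dagger} + \mathbf{V}_r \tag{3}$$

where the $\tau_{rp} \times K$ matrix $\boldsymbol{\Psi} = [\boldsymbol{\psi}_1\ \boldsymbol{\psi}_2\ \cdots\ \boldsymbol{\psi}_K]$ and the components
of the $M \times \tau_{rp}$ additive noise matrix $\mathbf{V}_r$ are i.i.d. CN(0, 1).
The base-station obtains the LMMSE (linear minimum-mean-square-error)
estimate of the channel

$$\hat{\mathbf{H}} = \mathrm{diag}\left\{\left[\frac{\sqrt{\rho_{r1}\tau_{rp}}}{1+\rho_{r1}\tau_{rp}}\ \cdots\ \frac{\sqrt{\rho_{rK}\tau_{rp}}}{1+\rho_{rK}\tau_{rp}}\right]^T\right\} \boldsymbol{\Psi}^T \mathbf{Y}_r^T \tag{4}$$

This estimate $\hat{\mathbf{H}}$ is the conditional mean of $\mathbf{H}$ and hence,
the MMSE estimate as well. By the properties of conditional
mean and joint Gaussian distribution, the estimate $\hat{\mathbf{H}}$
is independent of the estimation error $\tilde{\mathbf{H}} = \mathbf{H} - \hat{\mathbf{H}}$. The
components of $\hat{\mathbf{H}}$ are independent and the elements of its
$k$-th row are $CN\left(0, \frac{\rho_{rk}\tau_{rp}}{1+\rho_{rk}\tau_{rp}}\right)$. In addition, the components
of $\tilde{\mathbf{H}}$ are independent and the elements of its $k$-th row are
$CN\left(0, \frac{1}{1+\rho_{rk}\tau_{rp}}\right)$.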
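
The training and estimation steps can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the pilot matrix is built from columns of a unitary DFT matrix (any orthonormal set would do), and the helper names `crandn` and `estimate_channel` are hypothetical. Each user's training observation is scaled by $\sqrt{\rho_{rk}\tau_{rp}}/(1+\rho_{rk}\tau_{rp})$, the LMMSE gain for a unit-variance channel observed in unit-variance noise.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def estimate_channel(H, rho_r, tau_rp, rng):
    """Reverse-link training followed by LMMSE estimation of the K x M channel H."""
    K, M = H.shape
    # Assumed pilot choice: first K columns of a unitary DFT matrix (orthonormal).
    Psi = np.fft.fft(np.eye(tau_rp))[:, :K] / np.sqrt(tau_rp)      # tau_rp x K
    E_r = np.diag(np.sqrt(rho_r))                                  # reverse-link amplitudes
    V_r = crandn(rng, M, tau_rp)                                   # receiver noise
    Y_r = np.sqrt(tau_rp) * H.T @ E_r @ Psi.conj().T + V_r         # received training block
    gain = np.sqrt(rho_r * tau_rp) / (1.0 + rho_r * tau_rp)        # per-user LMMSE scaling
    return gain[:, None] * (Psi.T @ Y_r.T)                         # K x M estimate
```

Averaging $|\hat{h}_{km}|^2$ and $|h_{km} - \hat{h}_{km}|^2$ over many independent draws should give values close to $\rho_{rk}\tau_{rp}/(1+\rho_{rk}\tau_{rp})$ and $1/(1+\rho_{rk}\tau_{rp})$, respectively.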

IV. Homogeneous Users 

In this section, we focus on the special case where forward 
SINRs from the base-station to all users are equal and also 
reverse SINRs from all users to the base-station are equal, 
i.e., $\rho_{f1} = \cdots = \rho_{fK} = \rho_f$ and $\rho_{r1} = \cdots = \rho_{rK} = \rho_r$.

A. Scheduling and Pre-Conditioning on Forward Link

The base-station selects N (≤ K) users among the K users
and pre-conditions the information signals to be transmitted
to these N users. The scheduling strategy used to select
the users is explained in Section IV-C. Let the set of users
selected be $S \subset \{1, 2, \cdots, K\}$ with N distinct entries. The
base-station forms the M × 1 transmission signal-vector $\mathbf{s}_f$,
which drives the antennas, from the information symbol-vector
$\mathbf{q} = [q_1\ q_2\ \cdots\ q_N]^T$ for the selected users by pre-multiplying
it with a pre-conditioning matrix. We use the pre-conditioning
matrix

$$\mathbf{A}_S = \frac{\hat{\mathbf{H}}_S^{\dagger}\left(\hat{\mathbf{H}}_S \hat{\mathbf{H}}_S^{\dagger}\right)^{-1}}{\sqrt{\mathrm{tr}\left[\left(\hat{\mathbf{H}}_S \hat{\mathbf{H}}_S^{\dagger}\right)^{-1}\right]}} \tag{5}$$

which is proportional to the pseudo-inverse of the estimated
channel. The N × M matrix $\hat{\mathbf{H}}_S$ is formed by the rows in set
S of matrix $\hat{\mathbf{H}}$. We use this pre-conditioning matrix because
of the lack of any channel knowledge at the users. The
pre-conditioning matrix is normalized so that $\mathrm{tr}(\mathbf{A}_S^{\dagger}\mathbf{A}_S) = 1$.
The transmission signal-vector is given by

$$\mathbf{s}_f = \mathbf{A}_S \mathbf{q} \tag{6}$$

and the power constraint at the base-station is satisfied by
imposing the condition $E[|q_n|^2] = 1,\ \forall n \in \{1, \cdots, N\}$. From
(1) and (6), we obtain the signal-vector received at the selected
users to be

$$\mathbf{x}_f = \sqrt{\rho_f}\,\mathbf{H}_S \mathbf{A}_S \mathbf{q} + \mathbf{w}_f \tag{7}$$

where $\mathbf{H}_S$ is the matrix formed by the rows in set S of the
matrix $\mathbf{H}$.
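
As an illustration of the normalized pseudo-inverse pre-conditioning described above, the sketch below builds the matrix from an N × M estimated channel and returns the normalizing scalar alongside it. The function name `preconditioner` and its return convention are assumptions made for this example.

```python
import numpy as np

def preconditioner(H_hat_S):
    """Unit-trace pseudo-inverse of the N x M estimated channel; returns (A_S, scale)."""
    gram_inv = np.linalg.inv(H_hat_S @ H_hat_S.conj().T)   # (H H^dagger)^{-1}, N x N
    scale = 1.0 / np.sqrt(np.trace(gram_inv).real)         # enforces tr(A^dagger A) = 1
    A_S = scale * H_hat_S.conj().T @ gram_inv               # M x N pre-conditioning matrix
    return A_S, scale
```

With this normalization, multiplying the estimated channel by the returned matrix gives `scale` times the N × N identity, so with perfect channel knowledge every selected user would see the same scalar gain and no inter-user interference.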

B. Lower Bound on Sum Capacity 

In this section, we obtain a lower bound on the sum 
capacity of the system under consideration. The approach is 
similar to that in [5], [15]. The lower bound holds for any 
scheduling strategy used at the base-station which selects a 
fixed number of users. Recall that the base-station performs 
channel estimation as described in Section III.

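Theorem 1: For the system under consideration, every selected
user can achieve a downlink rate during data transmission
of at least

$$C_{\mathrm{ind-lb}} = \log_2\left(1 + \frac{\rho_f E^2[\chi]}{1 + \rho_f\left(\frac{1}{1+\rho_r\tau_{rp}} + \mathrm{var}\{\chi\}\right)}\right) \tag{8}$$

bits/transmission where $\chi$ is the scalar random variable given
by $\chi = \left(\mathrm{tr}\left[\left(\hat{\mathbf{H}}_S \hat{\mathbf{H}}_S^{\dagger}\right)^{-1}\right]\right)^{-1/2}$.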

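Proof: Let $\tilde{\mathbf{H}}_S$ be defined as the matrix formed by the
rows in set S of the matrix $\tilde{\mathbf{H}}$. The N × N effective forward
channel matrix in (7) is

$$\mathbf{G} = \sqrt{\rho_f}\,\mathbf{H}_S \mathbf{A}_S \tag{9}$$

$$\phantom{\mathbf{G}} = \sqrt{\rho_f}\left(\hat{\mathbf{H}}_S \mathbf{A}_S + \tilde{\mathbf{H}}_S \mathbf{A}_S\right) = \sqrt{\rho_f}\left(\chi \mathbf{I}_N + \tilde{\mathbf{H}}_S \mathbf{A}_S\right). \tag{10}$$

From (7) and (9), we can write the signal received by the $n$-th
user as

$$x_{fn} = \mathbf{g}_n^T \mathbf{q} + w_{fn} \tag{11}$$

where $\mathbf{g}_n^T$ is the $n$-th row of $\mathbf{G}$. From (10), we obtain

$$\mathbf{g}_n^T = \sqrt{\rho_f}\left(\chi \mathbf{e}_n^T + \tilde{\mathbf{h}}_n^T \mathbf{A}_S\right) \tag{12}$$

where $\tilde{\mathbf{h}}_n^T$ is the $n$-th row of $\tilde{\mathbf{H}}_S$ and $\mathbf{e}_n$ is the N × 1 vector
with $n$-th element equal to one and all other elements equal to
zero. From (12), we obtain

$$E\left[\mathbf{g}_n^T\right] = \sqrt{\rho_f}\,E[\chi]\,\mathbf{e}_n^T \tag{13}$$

and

$$E\left[\mathbf{g}_n^T \mathbf{g}_n^*\right] = \rho_f\left(E\left[\chi^2\right] + \frac{1}{1+\rho_r\tau_{rp}}\right). \tag{14}$$

Adding $E[\mathbf{g}_n^T]$ to and subtracting $E[\mathbf{g}_n^T]$ from $\mathbf{g}_n^T$ in (11),
we obtain

$$x_{fn} = E\left[\mathbf{g}_n^T\right]\mathbf{q} + \tilde{\mathbf{g}}_n^T \mathbf{q} + w_{fn} = \sqrt{\rho_f}\,E[\chi]\,q_n + \hat{w}_{fn} \tag{15}$$

where $\tilde{\mathbf{g}}_n^T = \mathbf{g}_n^T - E[\mathbf{g}_n^T]$ and $\hat{w}_{fn}$ is the zero-mean effective
noise. Since the signal $\mathbf{q}$ is independent of $\tilde{\mathbf{g}}_n$ and $E[\tilde{\mathbf{g}}_n] = \mathbf{0}$,
the signal is uncorrelated with the effective noise. Using
(13) and (14), we obtain the variance

$$\begin{aligned}
\mathrm{var}\{\hat{w}_{fn}\} &= E\left[\tilde{\mathbf{g}}_n^T \mathbf{q}\mathbf{q}^{\dagger} \tilde{\mathbf{g}}_n^*\right] + E\left[|w_{fn}|^2\right] \\
&= E\left[\tilde{\mathbf{g}}_n^T E\left[\mathbf{q}\mathbf{q}^{\dagger} \mid \tilde{\mathbf{g}}_n\right] \tilde{\mathbf{g}}_n^*\right] + E\left[|w_{fn}|^2\right] \\
&= E\left[\mathbf{g}_n^T \mathbf{g}_n^*\right] - E\left[\mathbf{g}_n^T\right]E\left[\mathbf{g}_n^*\right] + E\left[|w_{fn}|^2\right] \\
&= \rho_f\left(\frac{1}{1+\rho_r\tau_{rp}} + \mathrm{var}\{\chi\}\right) + 1.
\end{aligned} \tag{16}$$

Under the assumption that the users are aware of the
scheduling strategy, $E[\chi]$ is known to the users. We obtain
a lower bound on the downlink capacity of every selected
user during data transmission by assuming worst-case noise
distribution, which is uncorrelated Gaussian noise with same
variance [15]. Thus, from (15) and (16), we obtain (8) which
completes the proof. ■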

Corollary 1: For the system with homogeneous users considered,
a lower bound on the sum capacity is

$$C_{\mathrm{sum-lb}} = \max_{N \le K,\ N \in \mathbb{Z}^+} N \cdot C_{\mathrm{ind-lb}}. \tag{17}$$
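
The bound of Theorem 1 involves only the mean and variance of a scalar random variable, so it can be estimated by simulation. The sketch below is one possible Monte Carlo evaluation under stated assumptions: the per-user bound is taken to have the form $\log_2\left(1 + \rho_f E^2[\chi] / \left(1 + \rho_f\left(\tfrac{1}{1+\rho_r\tau_{rp}} + \mathrm{var}\{\chi\}\right)\right)\right)$ with $\chi = (\mathrm{tr}[(\hat{\mathbf{H}}_S\hat{\mathbf{H}}_S^{\dagger})^{-1}])^{-1/2}$, the N served users are chosen without looking at the channel, and N ≤ M. The function names and the `trials` and `seed` parameters are hypothetical.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def ind_rate_lb(M, N, rho_f, rho_r, tau_rp, trials=2000, seed=0):
    """Monte Carlo estimate of the per-user bound for N <= M users, no channel-aware selection."""
    rng = np.random.default_rng(seed)
    c = rho_r * tau_rp
    chi = np.empty(trials)
    for t in range(trials):
        H_hat = np.sqrt(c / (1 + c)) * crandn(rng, N, M)      # estimate entries ~ CN(0, c/(1+c))
        gram_inv = np.linalg.inv(H_hat @ H_hat.conj().T)
        chi[t] = 1.0 / np.sqrt(np.trace(gram_inv).real)
    noise = 1.0 + rho_f * (1.0 / (1 + c) + chi.var())         # effective noise variance
    return np.log2(1.0 + rho_f * chi.mean() ** 2 / noise)

def sum_rate_lb(M, K, rho_f, rho_r, tau_rp, **kw):
    """Largest N * (per-user bound) over N = 1..K, following Corollary 1."""
    return max(N * ind_rate_lb(M, N, rho_f, rho_r, tau_rp, **kw) for N in range(1, K + 1))
```

SINRs are passed in linear scale, so 0 dB corresponds to `rho_f=1.0` and -10 dB to `rho_r=0.1`.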

C. Scheduling Strategy 

The need for explicit scheduling arises due to the use of
pseudo-inverse based pre-conditioning of the information symbols.
With perfect channel knowledge at the base-station ($\hat{\mathbf{H}} = \mathbf{H}$)
and no scheduling (N = K), the pseudo-inverse based
pre-conditioning diagonalizes the effective forward channel
and every user sees a statistically identical effective channel
irrespective of its actual channel. The inability to vary the
effective gains to the users depending on their channel states
is due to the lack of any channel knowledge at the users. This
possibly causes a reduction in achievable sum rate. Motivated
by this, we propose a scheduling strategy which explicitly
selects N ≤ K users before pre-conditioning.

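In every coherence interval, the channel estimate at the base-station
is used to select the N users with largest estimated
channel gains. Let $\hat{\mathbf{h}}_{(1)}^T, \hat{\mathbf{h}}_{(2)}^T, \cdots, \hat{\mathbf{h}}_{(K)}^T$ be the norm-ordered
rows of the estimated channel matrix $\hat{\mathbf{H}}$. Then, the matrix $\hat{\mathbf{H}}_S$
is given by $\hat{\mathbf{H}}_S = [\hat{\mathbf{h}}_{(1)}\ \hat{\mathbf{h}}_{(2)}\ \cdots\ \hat{\mathbf{h}}_{(N)}]^T$ and the lower bound
in (8) becomes

$$C_{\mathrm{ind-lb}} = \log_2\left(1 + \frac{\rho_f\left(\frac{\rho_r\tau_{rp}}{1+\rho_r\tau_{rp}}\right)E^2[\eta]}{1 + \frac{\rho_f}{1+\rho_r\tau_{rp}} + \rho_f\left(\frac{\rho_r\tau_{rp}}{1+\rho_r\tau_{rp}}\right)\mathrm{var}\{\eta\}}\right). \tag{18}$$

Here, the random variable $\eta = \left(\mathrm{tr}\left[\left(\mathbf{U}\mathbf{U}^{\dagger}\right)^{-1}\right]\right)^{-1/2}$ where
$\mathbf{U}$ is the N × M matrix formed by the N rows with largest
norms of a K × M random matrix $\mathbf{Z}$ whose elements are
i.i.d. CN(0, 1). We provide numerical results showing the
improvement obtained by using this strategy in Section VI.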
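
The norm-based selection rule and the resulting scalar statistic can be sketched as follows. This is an illustrative example with hypothetical helper names (`select_strongest`, `eta`, `eta_samples`); it draws a K × M matrix with i.i.d. CN(0, 1) entries, keeps the N rows of largest norm, and evaluates $(\mathrm{tr}[(\mathbf{U}\mathbf{U}^{\dagger})^{-1}])^{-1/2}$ on the kept rows.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def select_strongest(H_hat, N):
    """Indices of the N rows of H_hat with the largest norms, strongest first."""
    norms = np.linalg.norm(H_hat, axis=1)
    return np.argsort(-norms)[:N]

def eta(U):
    """(tr[(U U^dagger)^{-1}])^{-1/2} for an N x M matrix U with N <= M."""
    return 1.0 / np.sqrt(np.trace(np.linalg.inv(U @ U.conj().T)).real)

def eta_samples(M, K, N, trials=1000, seed=0):
    """Samples of eta when U holds the N largest-norm rows of a K x M CN(0, 1) matrix."""
    rng = np.random.default_rng(seed)
    out = np.empty(trials)
    for t in range(trials):
        Z = crandn(rng, K, M)
        out[t] = eta(Z[select_strongest(Z, N)])
    return out
```

The sample mean and variance of `eta_samples(...)` are the two quantities needed to evaluate the scheduled bound for a given pair of SINRs and training length.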



D. Net Achievable Sum Rate 

Net achievable sum rate accounts for the reduction in 
achievable sum rate due to training. In every coherence interval 
of T symbols, the first $\tau_{rp}$ symbols are used for training on the reverse
link, one symbol is used for computation (same assumption
as in [5]) and the remaining $T - \tau_{rp} - 1$ symbols are used for
transmitting information symbols. The number of users K and
the training length $\tau_{rp}$ can be chosen such that net throughput
of the system is maximized. Thus, net achievable sum rate is
defined as

$$C_{\mathrm{net}}(M, \rho_f, \rho_r) = \max_{K,\ \tau_{rp}} \frac{T - \tau_{rp} - 1}{T}\, C_{\mathrm{sum-lb}}(\cdot) \tag{19}$$

subject to the constraints $\tau_{rp} \le T - 2$ and $K \le \min(M, \tau_{rp})$.
$C_{\mathrm{sum-lb}}(\cdot)$ in (19) is given by (17).
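
The trade-off between training overhead and achievable rate can be explored with a small exhaustive search. The sketch below assumes the net rate is the sum-rate bound scaled by the fraction $(T - \tau_{rp} - 1)/T$ of symbols left for data, with $\tau_{rp} \le T - 2$ and $K \le \min(M, \tau_{rp})$. The function `net_sum_rate` and the callable argument `sum_rate_lb(K, tau_rp)` are hypothetical; any routine returning a sum-rate bound for K users and training length `tau_rp` can be plugged in.

```python
def net_sum_rate(T, M, sum_rate_lb):
    """Search tau_rp <= T - 2 and K <= min(M, tau_rp) for the largest net rate.

    Returns (best_rate, best_K, best_tau_rp).
    """
    best = (0.0, None, None)
    for tau_rp in range(1, T - 1):                  # tau_rp = 1, ..., T - 2
        for K in range(1, min(M, tau_rp) + 1):
            rate = (T - tau_rp - 1) / T * sum_rate_lb(K, tau_rp)
            if rate > best[0]:
                best = (rate, K, tau_rp)
    return best
```

As a toy check, if the sum-rate bound were simply equal to K, then for T = 10 and M = 4 the search would settle on K = 4 and $\tau_{rp} = 4$: shorter training limits the number of users, while longer training only wastes data symbols.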

V. Heterogeneous Users 
In this section, we consider the general setting described 
in Section II with heterogeneous users. Moreover, we study
the problem of maximizing achievable weighted-sum rate. 
The motivation behind this problem is that many algorithms 
implemented in layers above physical layer assign weights 
to each user depending on various factors. We assume that 
these weights are pre-determined and known. We propose a 
modified pre-conditioning method and derive an optimized 
pre-conditioning matrix under the M-large assumption. We further
combine this with a scheduling strategy to obtain an improved 
lower bound on the weighted-sum capacity. 

A. Modified Pre-Conditioning 

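The base-station obtains the M × 1 transmission signal-vector
by pre-multiplying the information symbols
$\mathbf{q} = [q_1\ q_2\ \cdots\ q_K]^T$ with a pre-conditioning matrix as explained in
Section IV-A. We propose a modified pre-conditioning matrix
given by

$$\mathbf{A}_D = \frac{\hat{\mathbf{H}}_D^{\dagger}\left(\hat{\mathbf{H}}_D \hat{\mathbf{H}}_D^{\dagger}\right)^{-1}}{\sqrt{\mathrm{tr}\left[\left(\hat{\mathbf{H}}_D \hat{\mathbf{H}}_D^{\dagger}\right)^{-1}\right]}} \tag{20}$$

where $\hat{\mathbf{H}}_D = \mathbf{D}\hat{\mathbf{H}}$ and $\mathbf{D} = \mathrm{diag}\{[p_1^{-1/2}\ p_2^{-1/2}\ \cdots\ p_K^{-1/2}]^T\}$.
The choice of $\mathbf{D}$ is explained in Section V-C. From (1), we
obtain the signal-vector received at the users

$$\mathbf{x}_f = \mathbf{E}_f \mathbf{H} \mathbf{A}_D \mathbf{q} + \mathbf{w}_f \tag{21}$$

where $\mathbf{E}_f = \mathrm{diag}\{[\sqrt{\rho_{f1}}\ \sqrt{\rho_{f2}}\ \cdots\ \sqrt{\rho_{fK}}]^T\}$.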

B. Lower Bound on Weighted-Sum Capacity 

In this section, we generalize the lower bound derived in 
Section IV-B to heterogeneous users and weighted-sum rate.

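Theorem 2: For the system under consideration, a lower
bound on the downlink weighted-sum capacity during transmission
is given by

$$C_{\mathrm{wt-lb}} = \sum_{k=1}^{K} w_k \log_2\left(1 + \frac{\rho_{fk}\, p_k\, E^2[\varphi]}{1 + \rho_{fk}\left(\frac{1}{1+\rho_{rk}\tau_{rp}} + p_k\,\mathrm{var}\{\varphi\}\right)}\right). \tag{22}$$

Here, the random variable $\varphi$ is given by

$$\varphi = \left(\mathrm{tr}\left[\left(\mathbf{F}\mathbf{Z}\mathbf{Z}^{\dagger}\mathbf{F}\right)^{-1}\right]\right)^{-1/2} \tag{23}$$

where $\mathbf{F} = \mathbf{D} \cdot \mathrm{diag}\left\{\left[\sqrt{\frac{\rho_{r1}\tau_{rp}}{1+\rho_{r1}\tau_{rp}}}\ \cdots\ \sqrt{\frac{\rho_{rK}\tau_{rp}}{1+\rho_{rK}\tau_{rp}}}\right]^T\right\}$ and
$\mathbf{Z}$ is the K × M random matrix whose elements are i.i.d.
CN(0, 1).

Proof: The effective forward channel in (21) is

$$\mathbf{G} = \mathbf{E}_f \mathbf{H} \mathbf{A}_D = \mathbf{E}_f\left(\mathbf{D}^{-1}\hat{\mathbf{H}}_D \mathbf{A}_D + \tilde{\mathbf{H}}\mathbf{A}_D\right) = \mathbf{E}_f\left(\varphi\,\mathbf{D}^{-1} + \tilde{\mathbf{H}}\mathbf{A}_D\right). \tag{24}$$

The remaining steps in this proof are similar to those in the
proof of Theorem 1 and hence, we skip them. ■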

C. M-large Asymptotics and Optimization of Pre-Conditioning 
Matrix 

We wish to choose the matrix $\mathbf{D}$ such that $C_{\mathrm{wt-lb}}$ in
(22) is maximized. However, this problem is hard to analyze.
We consider the asymptotic regime $M/K \gg 1$. Apart from
making the problem mathematically tractable, this asymptotic
regime is interesting for the following two reasons. i) In
our system model, we observe from the numerical results given in
Section VI that extra base-station antennas are always
beneficial. This observation was first made for homogeneous users in
[5]. ii) The system-imposed constraints $K \le \tau_{rp}$ and $\tau_{rp} < T$
restrict the value of K.



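It is known that $\lim_{M/K \to \infty} \frac{1}{M}\mathbf{Z}\mathbf{Z}^{\dagger} = \mathbf{I}_K$ where $\mathbf{Z}$ is the
K × M random matrix whose elements are i.i.d. CN(0, 1).
Therefore, under the M-large approximation, the random variable $\varphi$
in (23) can be approximated by

$$\varphi \approx \left(\frac{M}{\mathrm{tr}\left(\mathbf{F}^{-2}\right)}\right)^{1/2} = \left(\frac{M}{\sum_{i=1}^{K} a_i p_i}\right)^{1/2} \tag{25}$$

which is a constant. Substituting (25) in (22), we get

$$C_{\mathrm{wt-lb}} \approx J(\mathbf{p}) = \sum_{i=1}^{K} w_i \log_2\left(1 + \frac{b_i p_i}{\sum_{j=1}^{K} a_j p_j}\right) \tag{26}$$

where $a_i = \frac{1+\rho_{ri}\tau_{rp}}{\rho_{ri}\tau_{rp}}$ and $b_i = \frac{M\rho_{fi}\left(1+\rho_{ri}\tau_{rp}\right)}{1+\rho_{fi}+\rho_{ri}\tau_{rp}}$.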
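
The quality of the M-large approximation is easy to probe numerically. The sketch below compares the exact value of $(\mathrm{tr}[(\mathbf{F}\mathbf{Z}\mathbf{Z}^{\dagger}\mathbf{F})^{-1}])^{-1/2}$ for one random draw of $\mathbf{Z}$ with the constant obtained by replacing $\mathbf{Z}\mathbf{Z}^{\dagger}$ with $M\mathbf{I}_K$, namely $\sqrt{M/\mathrm{tr}(\mathbf{F}^{-2})}$. It assumes $\mathbf{F}$ is a real diagonal matrix passed as a vector of its diagonal entries; the function names are hypothetical.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def phi_exact(F_diag, Z):
    """(tr[(F Z Z^dagger F)^{-1}])^{-1/2} for diagonal F (given as a vector) and K x M Z."""
    FZ = F_diag[:, None] * Z
    return 1.0 / np.sqrt(np.trace(np.linalg.inv(FZ @ FZ.conj().T)).real)

def phi_large_M(F_diag, M):
    """Deterministic approximation obtained by replacing Z Z^dagger with M * I."""
    return np.sqrt(M / np.sum(1.0 / F_diag ** 2))
```

For a fixed K, the ratio `phi_exact(F, Z) / phi_large_M(F, M)` should approach one as M grows, which is the sense in which the random variable is treated as a constant.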

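Theorem 3: Let $\mathbf{p} = [p_1\ p_2\ \cdots\ p_K]^T$ be any vector of non-negative
real numbers and $\mathbf{p}^{\mathrm{opt}} = \arg\max_{\mathbf{p}} J(\mathbf{p})$. Then the set
of possible values of $\mathbf{p}^{\mathrm{opt}}$ is $c\,\mathbf{p}^*$ where c is any positive real
number and $\mathbf{p}^* = [p_1^*\ p_2^*\ \cdots\ p_K^*]^T$ is such that

$$p_i^* = \max\left\{0,\ \frac{w_i}{\lambda^* a_i} - \frac{1}{b_i}\right\}. \tag{27}$$

The positive real number $\lambda^*$ is chosen such that the constraint
$\sum_{i=1}^{K} a_i p_i^* = 1$ is satisfied.

Proof: We use Lagrange multipliers to obtain this result.
Due to lack of space, we do not include the proof here. ■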
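
A solution of this water-filling type can be computed with a one-dimensional search for the multiplier. The sketch below is an assumption-laden illustration rather than the authors' procedure: it takes the objective to be $J(\mathbf{p}) = \sum_i w_i \log_2\left(1 + b_i p_i / \sum_j a_j p_j\right)$ with positive weights $w_i$ and positive constants $a_i$, $b_i$, uses the form $p_i = \max\{0,\ w_i/(\lambda a_i) - 1/b_i\}$, and finds $\lambda$ by bisection so that $\sum_i a_i p_i = 1$. The function name `optimize_power` and the `iters` parameter are hypothetical.

```python
import numpy as np

def optimize_power(w, a, b, iters=100):
    """Water-filling: p_i = max(0, w_i/(lam*a_i) - 1/b_i), lam set so that sum(a*p) = 1."""
    w, a, b = (np.asarray(v, dtype=float) for v in (w, a, b))

    def p_of(lam):
        return np.maximum(0.0, w / (lam * a) - 1.0 / b)

    # sum(a * p_of(lam)) decreases in lam: unbounded as lam -> 0, zero at lam = max(w*b/a).
    lo, hi = 1e-12, np.max(w * b / a)
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.sum(a * p_of(lam)) > 1.0:
            lo = lam          # too much power allocated, raise the multiplier
        else:
            hi = lam
    return p_of(0.5 * (lo + hi))
```

Because the objective depends on $\mathbf{p}$ only through ratios, any positive multiple of the returned vector gives the same value; the normalization $\sum_i a_i p_i = 1$ simply picks one representative. Users whose term $w_i/(\lambda a_i)$ falls below $1/b_i$ receive zero power.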



The optimized $\mathbf{p}^*$ given by (27) is substituted in (20) to
obtain the optimized pre-conditioning matrix. We use this
optimized pre-conditioning matrix even when K is comparable
to M.

D. Scheduling Strategy 

In our system model, the optimized values p* cannot depend 
on the instantaneous channel as no channel information is 
available to the users. Hence, we need explicit selection of 
users to take advantage of the instantaneous channel variations.
In this section, we propose a scheduling strategy for
heterogeneous users.

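Let $\mathbf{z}_1^T, \mathbf{z}_2^T, \cdots, \mathbf{z}_K^T$ be the rows of the matrix

$$\mathbf{Z} = \mathrm{diag}\left\{\left[\sqrt{\frac{1+\rho_{r1}\tau_{rp}}{\rho_{r1}\tau_{rp}}}\ \cdots\ \sqrt{\frac{1+\rho_{rK}\tau_{rp}}{\rho_{rK}\tau_{rp}}}\right]^T\right\}\hat{\mathbf{H}} \tag{28}$$

where $\hat{\mathbf{H}}$ is the estimated channel given by (4). In every coherence
interval, the users are ordered such that
$p^*_{(1)}\|\mathbf{z}_{(1)}\|^2 \ge p^*_{(2)}\|\mathbf{z}_{(2)}\|^2 \ge \cdots \ge p^*_{(K)}\|\mathbf{z}_{(K)}\|^2$ and information symbols
are transmitted to the first N users using the pre-conditioning
matrix formed by the appropriate rows of the optimized
pre-conditioning matrix as described in Section V-C. The value of
N is chosen in order to maximize achievable weighted-sum
rate. We denote this lower bound on achievable weighted-sum
rate with scheduling by $C^{\mathrm{s}}_{\mathrm{wt-lb}}(\cdot)$.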
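
The ordering rule for heterogeneous users can be sketched in a few lines. The example below assumes the rows of the estimated channel are first rescaled by $\sqrt{(1+\rho_{rk}\tau_{rp})/(\rho_{rk}\tau_{rp})}$ so that every user's row has unit-variance entries, and that users are then ranked by the product of their optimized power weight and squared row norm. The function name `schedule_users` and its argument layout are hypothetical.

```python
import numpy as np

def schedule_users(H_hat, rho_r, tau_rp, p, N):
    """Indices of the N users with the largest p_k * ||z_k||^2, best first."""
    c = np.asarray(rho_r, dtype=float) * tau_rp
    Z = np.sqrt((1.0 + c) / c)[:, None] * H_hat            # rescale rows to unit variance
    metric = np.asarray(p, dtype=float) * np.linalg.norm(Z, axis=1) ** 2
    return np.argsort(-metric)[:N]
```

The rescaling removes the dependence of the estimate's average power on each user's reverse SINR, so the ranking reflects instantaneous fading and the power weights rather than estimation quality.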

E. Net Achievable Weighted-Sum Rate 

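We define net achievable weighted-sum rate as

$$C_{\mathrm{wt-net}}(M, K, \rho_f, \rho_r) = \max_{\tau_{rp}} \frac{T - \tau_{rp} - 1}{T}\, C^{\mathrm{s}}_{\mathrm{wt-lb}}(\cdot) \tag{29}$$

subject to the constraints $\tau_{rp} \ge K$ and $\tau_{rp} \le T - 2$.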

VI. Numerical Results 

We provide numerical results in both homogeneous and 
heterogeneous users scenarios to show the performance ben- 
efits obtained using the various proposed schemes. We are 
interested in the realistic communication regime when forward 
and reverse SINRs are low. We consider this regime since 
interference from neighboring base-stations forces systems to
operate in this regime. Moreover, we are interested in high 
mobility users. Hence, we choose the system parameters for 
these scenarios. 

A. Homogeneous Users 

We consider forward SINR of dB and reverse SINR 
$\rho_r$ of -10 dB. First, we keep the training sequence length
equal to the number of users, i.e., $\tau_{rp} = K$. In Fig. 2, we plot
the lower bound on the sum capacity with scheduling (Scheme-1)
and without scheduling (Scheme-0) for M = {4, 8, 16}
and K = {1, 2, ..., M}. Note that Scheme-0 is the lower
bound obtained in [5]. The proposed scheme gives significant
improvement which implies that the scheme is capable of
performing opportunistic scheduling. Next, in Fig. 3, we plot
net achievable sum rate versus M for T = {20, 30}. We
observe that the net achievable sum rate increases with M
for all schemes. As expected, the proposed scheduling scheme
(Scheme-1) outperforms Scheme-0.

Fig. 2. Lower bound on the sum capacity with scheduling (Scheme-1) and
without scheduling (Scheme-0)

Fig. 3. Net achievable sum rate with scheduling (Scheme-1) and without
scheduling (Scheme-0)

In Fig. 4, we plot the optimum values $\tau_{rp}^*$, $N^*$ and $K^*$,
which maximize net throughput, versus forward SINR for T =
20 symbols. Here, we fix the reverse SINR to be 10 dB less
than the forward SINR. In all the cases plotted, the optimized
value of the number of users $K^* = \tau_{rp}^*$. In Fig. 4, we observe
that the scheduling gains are larger at low SINRs.

B. Heterogeneous Users 

We consider a multi-user system consisting of K = 8
users with forward SINRs {-4, -3, -2, -1, 0, 1, 2, 3} dB and
coherence interval T = 20 symbols. We assume that the
reverse SINR associated with every user is 10 dB lower than
its forward SINR. Next, we assign a weight of 2 to the first
four users and unit weight to the remaining users. The plot in
Fig. 5 of net achievable weighted-sum rate versus M clearly
shows that using more antennas at the base-station is
beneficial. Scheme-2 denotes optimized pre-conditioning with no
scheduling and Scheme-3 denotes optimized pre-conditioning
combined with scheduling. We observe that Scheme-3 gives
the best performance. The performance gain due to scheduling
is very significant when the number of users is comparable
to the number of base-station antennas.

Fig. 4. Optimum values of parameters versus forward SINR

Fig. 5. Net achievable weighted-sum rate with optimized pre-conditioning
(Scheme-2) and this combined with scheduling (Scheme-3)

VII. Conclusion 

Our results show that even in interference-limited and highly
mobile communication systems, the effective use of multiple
antennas at the base-station greatly improves net downlink
throughput in a multi-user setting. We conclude that it is
advantageous to increase the number of base-station antennas
in the system model we considered. Reciprocal training made
feasible by time-division duplex (TDD) operation is key to this
result. With an increase in the number of base-station antennas,
the effective forward channel improves whereas the training
sequence length required is not affected. The training sequence
length has significant impact on the net throughput of mobile
systems and hence, it is important to optimize it depending on
various system parameters as discussed in the paper.

In multi-user multiple antenna systems, scheduling and
pre-conditioning are practical schemes that can potentially 
improve the net throughput of these systems. We proposed 
scheduling schemes in both homogeneous and heterogeneous 
users scenarios and showed that these schemes significantly 
improve achievable sum/weighted-sum rate. The optimized 
pre-conditioning derived is applicable to the generic case 
with arbitrary set of weights, forward and reverse SINRs. 
Also, the optimization involved is computationally simple and 
can be implemented efficiently. As future work, we plan to 
extend these ideas to design a cellular network with aggressive 
frequency reuse supporting high mobility and high downlink 
rates. 